Nakuru
TEMPO: Global Temporal Building Density and Height Estimation from Satellite Imagery
Glazer, Tammy, Hacheme, Gilles Q., Zaytar, Akram, Marotti, Luana, Michaels, Amy, Tadesse, Girmaw Abebe, White, Kevin, Dodhia, Rahul, Zolli, Andrew, Becker-Reshef, Inbal, Ferres, Juan M. Lavista, Robinson, Caleb
We present TEMPO, a global, temporally resolved dataset of building density and height derived from high-resolution satellite imagery using deep learning models. We pair building footprint and height data from existing datasets with quarterly PlanetScope basemap satellite images to train a multi-task deep learning model that predicts building density and building height at a 37.6-meter per pixel resolution. We apply this model to global PlanetScope basemaps from Q1 2018 through Q2 2025 to create global, temporal maps of building density and height. We validate these maps by comparing against existing building footprint datasets. Our estimates achieve an F1 score between 85% and 88% on different hand-labeled subsets, and are temporally stable, with a 0.96 five-year trend-consistency score. TEMPO captures quarterly changes in built settlements at a fraction of the computational cost of comparable approaches, unlocking large-scale monitoring of development patterns and climate impacts essential for global resilience and adaptation efforts.
- Africa > Sudan (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > Southeast Asia (0.04)
- Asia > China > Shanghai > Shanghai (0.06)
- North America > United States > New York > New York County > New York City (0.05)
- Asia > China > Zhejiang Province > Hangzhou (0.04)
- Transportation > Infrastructure & Services (0.93)
- Transportation > Ground > Road (0.68)
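The TEMPO abstract above reports F1 scores of 85% to 88% on hand-labeled subsets. As a reminder of what that metric measures, here is a minimal sketch that computes F1 from binary built/not-built labels. The function name `f1_score` and the sample labels are hypothetical, and the abstract does not say how the authors binarize density predictions, so this illustrates the metric only and is not their evaluation pipeline.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for binary labels (1 = built, 0 = not built). Illustrative only."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No correct positive predictions: precision or recall is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for six pixels (assumed data, not from the paper).
labels = [1, 1, 1, 0, 0, 1]
predictions = [1, 1, 0, 0, 1, 1]
score = f1_score(labels, predictions)
```

Because F1 is a harmonic mean, it stays low unless both precision and recall are high, which is why it is a common choice when built pixels are much rarer than unbuilt ones.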
Unlocking Location Intelligence: A Survey from Deep Learning to The LLM Era
Hao, Xixuan, Jiang, Yutian, Zou, Xingchen, Liu, Jiabo, Yin, Yifang, Liang, Yuxuan
Location Intelligence (LI), the science of transforming location-centric geospatial data into actionable knowledge, has become a cornerstone of modern spatial decision-making. The rapid evolution of Geospatial Representation Learning is fundamentally reshaping LI development through two successive technological revolutions: the deep learning breakthrough and the emerging large language model (LLM) paradigm. While deep neural networks (DNNs) have demonstrated remarkable success in automated feature extraction from structured geospatial data (e.g., satellite imagery, GPS trajectories), the recent integration of LLMs introduces transformative capabilities for cross-modal geospatial reasoning and unstructured geo-textual data processing. This survey presents a comprehensive review of geospatial representation learning across both technological eras, organizing them into a structured taxonomy based on the complete pipeline comprising: (1) data perspective, (2) methodological perspective, and (3) application perspective. We also highlight current advancements, discuss existing limitations, and propose potential future research directions in the LLM era. This work offers a thorough exploration of the field and provides a roadmap for further innovation in LI. The up-to-date paper list can be found at https://github.com/CityMind-Lab/Awesome-Location-Intelligence and will be updated continuously.
- Asia > China > Beijing > Beijing (0.05)
- North America > United States > District of Columbia > Washington (0.05)
- Asia > China > Shanghai > Shanghai (0.05)
- Transportation (0.67)
- Banking & Finance > Real Estate (0.46)
- Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.38)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
RideKE: Leveraging Low-Resource, User-Generated Twitter Content for Sentiment and Emotion Detection in Kenyan Code-Switched Dataset
Etori, Naome A., Gini, Maria L.
Social media has become a crucial open-access platform for individuals to express opinions and share experiences. However, leveraging low-resource language data from Twitter is challenging due to scarce, poor-quality content and the major variations in language use, such as slang and code-switching. Identifying tweets in these languages can be difficult as Twitter primarily supports high-resource languages. We analyze Kenyan code-switched data and evaluate four state-of-the-art (SOTA) transformer-based pretrained models for sentiment and emotion classification, using supervised and semi-supervised methods. We detail the methodology behind data collection and annotation, and the challenges encountered during the data curation phase. Our results show that XLM-R outperforms the other models. For sentiment analysis, the supervised XLM-R model achieves the highest accuracy (69.2%) and F1 score (66.1%), while the semi-supervised XLM-R model reaches 67.2% accuracy and a 64.1% F1 score. For emotion analysis, supervised DistilBERT leads in accuracy (59.8%) and F1 score (31%), while semi-supervised mBERT reaches 59% accuracy and a 26.5% F1 score. AfriBERTa models show the lowest accuracy and F1 scores. All models tend to predict neutral sentiment, with Afri-BERT showing the highest bias and unique sensitivity to empathy emotion. https://github.com/NEtori21/Ride_hailing
- Africa > Kenya > Nairobi City County > Nairobi (0.07)
- Africa > Kenya > Nairobi Province (0.06)
- Africa > Kenya > Mombasa County > Mombasa (0.05)
- Transportation > Passenger (1.00)
- Information Technology (1.00)
- Transportation > Ground > Road (0.93)
Building low-resource African language corpora: A case study of Kidawida, Kalenjin and Dholuo
Mbogho, Audrey, Awuor, Quin, Kipkebut, Andrew, Wanzare, Lilian, Oloo, Vivian
Natural Language Processing is a crucial frontier in artificial intelligence, with broad applications in many areas, including public health, agriculture, education, and commerce. However, due to the lack of substantial linguistic resources, many African languages remain underrepresented in this digital transformation. This paper presents a case study on the development of linguistic corpora for three under-resourced Kenyan languages, Kidaw'ida, Kalenjin, and Dholuo, with the aim of advancing natural language processing and linguistic research in African communities. Our project, which lasted one year, employed a selective crowd-sourcing methodology to collect text and speech data from native speakers of these languages. Data collection involved (1) recording conversations and translating the resulting text into Kiswahili, thereby creating parallel corpora, and (2) reading and recording written texts to generate speech corpora. We made these resources freely accessible via open-research platforms, namely Zenodo for the parallel text corpora and Mozilla Common Voice for the speech datasets, thus facilitating ongoing contributions and access for developers to train models and develop Natural Language Processing applications. The project demonstrates how grassroots efforts in corpus building can support the inclusion of African languages in artificial intelligence innovations. In addition to filling resource gaps, these corpora are vital in promoting linguistic diversity and empowering local communities by enabling Natural Language Processing applications tailored to their needs. As African countries like Kenya increasingly embrace digital transformation, developing indigenous language resources becomes essential for inclusive growth. We encourage continued collaboration from native speakers and developers to expand and utilize these corpora.
- Africa > South Sudan (0.14)
- Africa > Uganda (0.05)
- North America > United States (0.04)
- Health & Medicine (0.67)
- Media > News (0.46)
Uchaguzi-2022: A Dataset of Citizen Reports on the 2022 Kenyan Election
Mondini, Roberto, Kotonya, Neema, Logan, Robert L. IV, Olson, Elizabeth M, Lungati, Angela Oduor, Odongo, Daniel Duke, Ombasa, Tim, Lamba, Hemank, Cahill, Aoife, Tetreault, Joel R., Jaimes, Alejandro
Online reporting platforms have enabled citizens around the world to collectively share their opinions and report in real time on events impacting their local communities. Systematically organizing (e.g., categorizing by attributes) and geotagging large amounts of crowdsourced information is crucial to ensuring that accurate and meaningful insights can be drawn from this data and used by policy makers to bring about positive change. These tasks, however, typically require extensive manual annotation efforts. In this paper we present Uchaguzi-2022, a dataset of 14k categorized and geotagged citizen reports related to the 2022 Kenyan General Election containing mentions of election-related issues such as official misconduct, vote count irregularities, and acts of violence. We use this dataset to investigate whether language models can assist in scalably categorizing and geotagging reports, thus highlighting its potential application in the AI for Social Good space.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Africa > Kenya > Bomet County > Bomet (0.05)
Farmer.Chat: Scaling AI-Powered Agricultural Services for Smallholder Farmers
Singh, Namita, Wang'ombe, Jacqueline, Okanga, Nereah, Zelenska, Tetyana, Repishti, Jona, K, Jayasankar G, Mishra, Sanjeev, Manokaran, Rajsekar, Singh, Vineet, Rafiq, Mohammed Irfan, Gandhi, Rikin, Nambi, Akshay
Small and medium-sized agricultural holders face challenges like limited access to localized, timely information, impacting productivity and sustainability. Traditional extension services, which rely on in-person agents, struggle with scalability and timely delivery, especially in remote areas. We introduce Farmer.Chat, a generative AI-powered chatbot designed to address these issues. Leveraging Generative AI, Farmer.Chat offers personalized, reliable, and contextually relevant advice, overcoming limitations of previous chatbots in deterministic dialogue flows, language support, and unstructured data processing. Deployed in four countries, Farmer.Chat has engaged over 15,000 farmers and answered over 300,000 queries. This paper highlights how Farmer.Chat's innovative use of GenAI enhances agricultural service scalability and effectiveness. Our evaluation, combining quantitative analysis and qualitative insights, shows Farmer.Chat's effectiveness in improving farming practices and in enhancing trust, response quality, and user engagement.
- North America > United States > Texas > Crockett County (0.04)
- Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)
- Africa > Kenya > Nyeri County > Nyeri (0.04)
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (0.95)
- Research Report > Experimental Study (0.67)
- Health & Medicine (1.00)
- Food & Agriculture > Agriculture (1.00)
- Education (1.00)
Weakly Supervised Detection of Hallucinations in LLM Activations
Rateike, Miriam, Cintas, Celia, Wamburu, John, Akumu, Tanya, Speakman, Skyler
We propose an auditing method to identify whether a large language model (LLM) encodes patterns such as hallucinations in its internal states, which may propagate to downstream tasks. We introduce a weakly supervised auditing technique using a subset scanning approach to detect anomalous patterns in LLM activations from pre-trained models. Importantly, our method does not need knowledge of the type of patterns a priori. Instead, it relies on a reference dataset devoid of anomalies during testing. Further, our approach enables the identification of pivotal nodes responsible for encoding these patterns, which may offer crucial insights for fine-tuning specific sub-networks for bias mitigation. We introduce two new scanning methods to handle LLM activations for anomalous sentences that may deviate from the expected distribution in either direction. Our results confirm prior findings of BERT's limited internal capacity for encoding hallucinations, while OPT appears capable of encoding hallucination information internally. Importantly, our scanning approach, without prior exposure to false statements, performs comparably to a fully supervised out-of-distribution classifier.
- Africa > Kenya > Nairobi Province (0.04)
- Africa > Kenya > Nairobi City County > Nairobi (0.04)
- Europe > Germany > Saarland (0.04)
- Africa > Kenya > Nakuru County > Nakuru (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
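The hallucination-detection abstract above relies on a reference dataset free of anomalies and on activations that may deviate in either direction. One common building block for such comparisons is the empirical p-value of a test activation against reference activations for a single node. The sketch below shows only that building block; the function names and sample values are hypothetical, and the abstract does not describe the authors' scanning statistic, so this should not be read as their method.

```python
from bisect import bisect_left, bisect_right
from typing import Sequence, Tuple


def empirical_p_values(reference: Sequence[float], value: float) -> Tuple[float, float]:
    """Return (left_tail, right_tail) empirical p-values of `value`.

    `reference` holds activations of one node on anomaly-free inputs.
    The +1 terms keep the p-values strictly positive.
    """
    ref = sorted(reference)
    n = len(ref)
    # Share of reference activations <= value (left tail).
    left = (bisect_right(ref, value) + 1) / (n + 1)
    # Share of reference activations >= value (right tail).
    right = (n - bisect_left(ref, value) + 1) / (n + 1)
    return left, right


def two_sided_p_value(reference: Sequence[float], value: float) -> float:
    """Small when `value` is extreme in either direction."""
    left, right = empirical_p_values(reference, value)
    return min(1.0, 2 * min(left, right))


# Hypothetical reference activations for one node (assumed data).
reference_activations = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
typical = two_sided_p_value(reference_activations, 0.5)  # near the median
extreme = two_sided_p_value(reference_activations, 5.0)  # far above all references
```

A scan over nodes would then look for subsets whose p-values are jointly smaller than expected under the reference distribution; the choice of scoring function is left open here.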
BART-SIMP: a novel framework for flexible spatial covariate modeling and prediction using Bayesian additive regression trees
Jiang, Alex Ziyu, Wakefield, Jon
Prediction is a classic challenge in spatial statistics and the inclusion of spatial covariates can greatly improve predictive performance when incorporated into a model with latent spatial effects. It is desirable to develop flexible regression models that allow for nonlinearities and interactions in the covariate structure. Machine learning models have been suggested in the spatial context, allowing for spatial dependence in the residuals, but fail to provide reliable uncertainty estimates. In this paper, we investigate a novel combination of a Gaussian process spatial model and a Bayesian Additive Regression Tree (BART) model. The computational burden of the approach is reduced by combining Markov chain Monte Carlo (MCMC) with the Integrated Nested Laplace Approximation (INLA) technique. We study the performance of the method via simulations and use the model to predict anthropometric responses, collected via household cluster samples in Kenya.
- North America > United States (0.46)
- Africa > Kenya > Nairobi City County > Nairobi (0.04)
- Africa > Kenya > Mombasa County > Mombasa (0.04)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.46)